An Empirical Evaluation of Probabilistic Lexicalized Tree Insertion Grammars

Author

  • Rebecca Hwa
Abstract

We present an empirical study of the applicability of Probabilistic Lexicalized Tree Insertion Grammars (PLTIGs), a lexicalized counterpart to Probabilistic Context-Free Grammars (PCFGs), to problems in stochastic natural language processing. Comparing the performance of PLTIGs with non-hierarchical N-gram models and PCFGs, we show that PLTIGs combine the best aspects of both, with language modeling capability comparable to N-grams and improved parsing performance over their nonlexicalized counterpart. Furthermore, training of PLTIGs converges faster than training of PCFGs.

Similar resources

Encoding Frequency Information in Lexicalized Grammars

We address the issue of how to associate frequency information with lexicalized grammar formalisms, using Lexicalized Tree Adjoining Grammar as a representative framework. We consider systematically a number of alternative probabilistic frameworks, evaluating their adequacy from both a theoretical and empirical perspective using data from existing large treebanks. We also propose three orthogon...

Bilexical Grammars and a Cubic-time Probabilistic Parser

Computational linguistics has a long tradition of lexicalized grammars, in which each grammatical rule is specialized for some individual word. The earliest lexicalized rules were word-specific subcategorization frames. It is now common to find fully lexicalized versions of many grammatical formalisms, such as context-free and tree-adjoining grammars [Schabes et al. 1988]. Other formalisms, suc...

Structured Language Models for Automatic Speech Transcription

Speech recognition typically involves three types of models: an acoustic model, a phonetic dictionary, and a language model. The primary purpose of the language model is to decide if a sentence is part of the language, and optionally how likely it is. N-gram is a common type of language model which predicts upcoming words based on a series of prior words. While efficient, a problem with this typ...

Extraction of Tree Adjoining Grammars from a Treebank for Korean

We present the implementation of a system which extracts not only lexicalized grammars but also feature-based lexicalized grammars from the Korean Sejong Treebank. We report on some practical experiments where we extract TAG grammars and tree schemata. Above all, full-scale syntactic tags and well-formed morphological analysis in the Sejong Treebank allow us to extract syntactic features. In addition, ...

A Decoder for Probabilistic Synchronous Tree Insertion Grammars

Synchronous tree insertion grammars (STIG) are formal models for syntax-based machine translation. We formalize a decoder for probabilistic STIG; the decoder transforms every source-language string into a target-language tree and calculates the probability of this transformation.

Publication date: 1998